Effects of OCR Errors on Ranking and Feedback Using the Vector Space Model
Authors
Abstract
We report on the performance of the vector space model in the presence of OCR errors. We show that average precision and recall are not affected for our full-text document collection when the OCR version is compared to its corresponding corrected set. We do, however, see divergence between the relevant-document rankings of the OCR and corrected collections under different weighting combinations. In particular, we observed that cosine normalization plays a considerable role in the disparity seen between the collections. Furthermore, we show that even though feedback improves retrieval for both collections, it cannot be used to compensate for OCR errors caused by badly degraded documents.
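The role of cosine normalization can be illustrated with a small sketch. It is not the paper's experimental setup: the texts, the garbage tokens, and the function names `tf_vector`, `inner_product`, and `cosine_score` are all hypothetical, and raw term frequency stands in for whatever weighting combinations the authors tested. The sketch assumes one plausible mechanism, namely that OCR noise adds spurious terms to a document, which lengthens its vector and lowers its cosine-normalized score even when the query terms themselves were recognized correctly.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector over lowercased, whitespace-split tokens."""
    return Counter(text.lower().split())


def inner_product(query, doc):
    """Unnormalized similarity: sum of products of matching term weights."""
    return sum(weight * doc.get(term, 0) for term, weight in query.items())


def cosine_score(query, doc):
    """Inner product divided by the Euclidean lengths of both vectors."""
    norm_q = math.sqrt(sum(w * w for w in query.values()))
    norm_d = math.sqrt(sum(w * w for w in doc.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return inner_product(query, doc) / (norm_q * norm_d)


# Hypothetical toy texts: the "OCR" version keeps the real words but adds
# four made-up garbage tokens of the kind a degraded page might produce.
corrected = tf_vector("vector space model ranks documents")
ocr = tf_vector("vector space model ranks documents d0cum ents ~~ rn")
query = tf_vector("vector model")

print(inner_product(query, corrected), inner_product(query, ocr))
print(round(cosine_score(query, corrected), 3), round(cosine_score(query, ocr), 3))
```

In this toy case the unnormalized inner product is 2 for both versions, while the cosine score falls from about 0.632 for the corrected text to about 0.471 for the noisy one, so the two collections can rank the same document differently once length normalization is applied.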
Similar resources
The Effect of Using Feedback Strategies with an Emphasis on Pharmaceutical Care standards on Nursing Students’ Knowledge and their Medication Errors
Introduction: The medication administration process is a critical aspect of professional nursing care. Preventing medical errors requires awareness, appropriate decision-making, and sound performance. The aim of this study is to evaluate the effect of using feedback strategies, with an emphasis on medication care standards, on nursing students' knowledge and their medication errors. Methods: During ...
Measuring the Effects of OCR Errors on Similarity Linking
The vector-space model offers an easy and robust model for Information Retrieval. Thereby, the similarities between queries and documents as well as the similarities between documents themselves are of importance. Document similarities may be used in order to generate links between documents that lead users from one document to related ones. Studies have shown that the vector-space model is rob...
Development of Lifetime Prediction Model of Lithium-Ion Battery Based on Minimizing Prediction Errors of Cycling and Operational Time Degradation Using Genetic Algorithm
Accurate lifetime prediction of lithium-ion batteries is a great challenge for the researchers and engineers involved in battery applications in electric vehicles and satellites. In this study, a semi-empirical model is introduced to predict the capacity loss of lithium-ion batteries as a function of charge and discharge cycles, operational time, and temperature. The model parameters are obtai...
Elicitation, Recast, and Meta-Linguistic Feedback in Form-Focused Exchanges: Effects of Feedback Modality on Multimedia Grammar Instruction
This research explores the effects of three computer-mediated feedback modalities, that is, elicitation, recast, and meta-linguistics, on the learning of English participial, gerund, and infinitival phrases among Iranian intermediate-level EFL learners. The overriding focus of the present study was to investigate whether different types of feedback given through form-focused computer-human exch...
Error Recovery by the Use of Sensory Feedback and Reference Measurements for Robotic Assembly
Industrial robots must transport instruments or parts, which requires coordinates that describe the robot's instrument, parts, and body. When investigating the robot's location, we are usually interested in measuring it relative to a reference coordinate system. In this system it is attempted to make the assemble direction smaller by designing the sensor board and making use of an instrumen...
Journal title:
- Inf. Process. Manage.
Volume 32, Issue
Pages -
Publication date 1996